European Journal of Human Genetics — Latest Matching Preprints

1

Investigating the Y chromosome in complex disease: Phenome-wide scan across 104,334 Finnish men

Preussner, A.; Leinonen, J. T.; FinnGen; Pirinen, M.; Tukiainen, T.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355235 medRxiv

Top 0.1%

37.1%

Although the Y chromosome represents roughly 2% of the male genome, it is often ignored in genome-wide association studies (GWAS). Consequently, the potential health impacts of Y-chromosomal genetic variation remain incompletely understood. To fill this gap, we performed a phenome-wide association study (PheWAS) in FinnGen across 1,426 binary and quantitative traits using Y-chromosomal variation (frequency ≥ 1%) in 104,334 genotyped men. As Y chromosome variation is prone to population stratification, we performed carefully adjusted association analyses and further examined these through kin-based validation in 19,275 female and 24,712 male first-degree relatives. We found 121 suggestive (p < 5.6×10⁻³) phenotypic associations on the Y chromosome, yet none of these were strong enough to reach phenome-wide significance (p < 3.9×10⁻⁶). While only 38 associations were supported in the kin-based validation, intriguingly we found support for a previously suggested link between haplogroup I1 and coronary heart disease (CHD; OR=1.06, 95% CI=1.02-1.11, p=3.7×10⁻³; male validation OR=1.05; female validation OR=0.97). The I1-CHD association was detected across distinct geographical areas within Finland and was independent of Loss of Y (LOY) and the autosomal risk for CHD, suggesting a link between germline Y-chromosomal variation and heart disease risk. Overall, this study presents a comprehensive phenome-wide analysis of Y-chromosomal associations, highlighting the potential relevance of Y-chromosomal variation beyond sex determination. Our findings further emphasize the need for improved capture of Y-chromosomal variants and further analyses in biobank-scale data to allow for deeper exploration of the male-specific genetic architecture of complex diseases.

2

Whole-exome-based preconception carrier screening in Uzbekistan with targeted SMA, FMR1, and DMD assays: the first reported clinical program

Kullyev, A.; Avdeichik, S.; Akimenkova, A.; Kartuesov, A.; Kardymon, O.; Goikhman, Y.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.02.26354713 medRxiv

Top 0.1%

15.0%

Purpose: Published clinical outcome data on preconception carrier screening (PCS) in Central Asia are limited. We report the first clinical implementation study from Uzbekistan of a whole-exome sequencing (WES)-based multi-platform PCS program combining exome sequencing with targeted SMA, FMR1, and DMD assays. Methods: We retrospectively analyzed anonymized data from 65 individuals (19 couples, 27 singletons) screened at IMC Genomics, Tashkent, between January 2024 and May 2026. WES covering the protein-coding regions of approximately 20,000 genes was followed by exome-wide bioinformatics filtering and clinical geneticist interpretation. Partly overlapping cohorts underwent SMA carrier screening (n=179), FMR1 CGG-repeat analysis in females (n=155), and DMD deletion/duplication testing in preconception females (n=29). Variants were classified by ACMG/AMP criteria against gnomAD v4.1. Results: Sixty-one of 65 WES-screened individuals (93.8%; 95% CI 85.2-97.6%) carried at least one reportable variant (152 instances across 126 genes). Four of 19 couples (21.1%; 95% CI 8.5-43.3%) were concordant for pathogenic or likely pathogenic variants in the same autosomal recessive gene; two were referred for preimplantation genetic testing for monogenic disease. SMA screening identified four carriers, including two 2+0 silent carriers; FMR1 analysis identified one intermediate allele; DMD MLPA identified no exonic rearrangements. Conclusion: This first reported WES-based multi-platform PCS program in Uzbekistan was feasible and clinically informative, identifying actionable couple-level reproductive risks and supporting structured implementation of reproductive genetic screening in Central Asia.
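
The abstract above reports 95% confidence intervals for two small-sample proportions (61/65 and 4/19) without naming the interval method. As a minimal sketch, and purely as an assumption on our part, the Wilson score interval (without continuity correction) is one common choice for small samples; the function name `wilson_interval` is illustrative and not taken from the preprint.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (no continuity correction).

    z defaults to the two-sided 95% normal quantile.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    # Centre is shrunk from p_hat toward 0.5, which matters for small n.
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract; the choice of method is our assumption.
for label, k, n in [("reportable-variant carriers", 61, 65), ("concordant couples", 4, 19)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

Under this assumption the computed bounds round to the intervals quoted in the abstract, which suggests (but does not prove) that a Wilson-type interval was used.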

3

Artificial Intelligence-Based Chatbots in Genetic Counseling Practice: Current Uptake, Utilization, and Perspectives

Daley, N.; Griswold, A.; Moreno, L.; Floyd, A.; Duong, D.; Solomon, B. D.; Waikel, R. L.

2026-05-24 genetic and genomic medicine 10.64898/2026.05.21.26353789 medRxiv

Top 0.1%

12.8%

AI-driven chatbots have been utilized in healthcare to automate administrative tasks, improve patient education, and expand access to medical information; however, their role in genetic counseling remains underexplored. To investigate the adoption, perceptions, and potential utility of AI-based chatbots in genetic counseling practice, 217 genetic counselors and genetic counseling students from across North America were surveyed regarding chatbot usage, confidence in their application, and perceived benefits and limitations. While most participants (166/217; 76.5%) reported using general AI chatbots outside of clinical settings, far fewer (18/204; 8.8%) reported using or recommending clinical genetics chatbots in clinical practice. Among those who used clinical genetics chatbots, the primary purposes were communication with at-risk family members (11/18; 61.1%) and patient education (10/18; 55.6%). Confidence in chatbot technology varied, with highest confidence in gathering family history information (81/199; 40.7%) and lowest confidence in their ability to disclose variants of uncertain significance or positive genetic testing results (5/199; 2.5%). The greatest perceived benefits included reducing repetitive tasks (165/195; 84.6%) and freeing time for other tasks (141/195; 72.3%), while major concerns revolved around patient comprehension (167/195; 85.6%) and having accurate, up-to-date information (145/195; 74.4%). Despite some concern about AI replacing human counselors, most participants reported they felt there was potential for chatbots to enhance workflow efficiency (128/195; 65.6%) if properly integrated and regulated. Limited AI training was identified as a barrier to adoption (16/195; 8.2% received training), highlighting a need for structured education on AI applications in genetic counseling.
These findings suggest that AI chatbots hold promise as supplementary tools, but significant challenges must be addressed before widespread implementation in genetic counseling practice.

4

Is it time for a paradigm shift? Tailored online video education instead of pretest genetic counseling facilitates high genetic test uptake and informed choice for adults seeking cardiovascular genetic testing

Rivers, B.; Murray, B.; Applegate, C. D.; Tichnell, C.; Gordon, C.; McClellan, R.; Brown, E.; Nunez, K.; Barth, A. S.; Taylor, C. O.; Yanek, L. R.; Day, J.; James, C. A.

2026-06-01 genetic and genomic medicine 10.64898/2026.05.28.26354394 medRxiv

Top 0.1%

8.9%

Background: Pretest genetic counseling (GC) is recommended in conjunction with genetic testing (GT) for cardiovascular (CV) indications, yet access to CVGC is limited, leading to delayed GT. Posttest GC could increase GC and GT access but requires efficient pretest education that supports both informed GT decision-making and robust GT uptake. Methods: We developed four indication-tailored online CV genetics education videos and deployed them in a 3-arm randomized trial comparing pretest vs. posttest outpatient CVGC (RESEQUENCE-GC, NCT05422573). Participants were randomized 1:1:1 to pretest video education plus an optional (efficiency arm) or required (flipped arm) phone call with a genetic counselor and planned posttest CVGC, or to standard pretest CVGC (SOC arm). Questionnaires administered at baseline and post-education included the CV Multidimensional Model of Informed Choice (MMIC) to quantify GT knowledge and informed GT choice. Results: 389/767 (50.7%) adults aged 18-80 (mean 51.2 ± 14.9 years) scheduling a first CVGC appointment consented to RESEQUENCE-GC and completed the baseline questionnaire. Efficiency arm participants (video education + optional phone call) were most likely to complete pretest education (134, 97.4% efficiency; 107, 85.6% flipped; 111, 87.4% SOC, p=0.0012) and elect GT (131, 95.6% efficiency; 105, 84.0% flipped; 107, 84.2% SOC, p=0.0036). Few (4, 2.9%) efficiency arm participants requested an optional pretest phone call. Most flipped arm participants (90, 84.1%) had no post-video questions, consistent with the median call duration of 97 seconds (IQR 65-145 s). CV genetics knowledge was high post-education (median 8 [IQR 7, 8] of 8 MMIC items correct). Only video-based pretest education was associated with a significant increase in knowledge (p<0.0001). Nearly all participants made an informed GT choice with no difference between intervention (95.6%) and SOC (90.4%) arms (p=0.074).
Conclusions: Tailored, online video pretest education can enhance CV GT uptake, support informed GT decision-making, and be integrated into efficient pretest workflows, suggesting utility in scalable posttest CVGC.

5

Not Forgotten: Patient Experiences with Genetic Variant Reclassifications

Gupta, P.; Park, M. S.; Kao, E. Y.; McEwen, A. E.; Kumar, R. D.; Horike-Pyne, M.; Fowler, D. M.; Starita, L. M.; Knerr, S.; Stergachis, A. B.

2026-05-17 genetic and genomic medicine 10.64898/2026.05.06.26352483 medRxiv

Top 0.1%

7.8%

Purpose: Genetic variant reclassification is increasingly common in clinical genomics, yet limited data describe how patients experience re-contact and variant reclassification in routine clinical care. Methods: We conducted semi-structured qualitative interviews with 20 adult patients who received a variant reclassification following routine clinical genetic testing. Interviews explored emotional responses, communication experiences, and perceived value of genetic testing. Data were analyzed using Template Analysis, a form of thematic analysis. Results: Three overarching themes were identified. Participants identified a need for improved communication of reclassified results, particularly with respect to timing, modality, and contextualization (Theme 1). Experiences with reclassification also shaped perceptions of the value of genetic testing, with most participants viewing testing as worthwhile despite its evolving nature (Theme 2). Finally, many participants interpreted reclassification as evidence of personalized and ongoing care, reinforcing trust in genetic testing and biomedical research (Theme 3). Participants generally preferred to be informed of reclassified results regardless of reclassification type, although the direction of reclassification influenced emotional responses and preferred modes of communication. Downgrades from variants of uncertain significance to benign or likely benign were widely viewed as meaningful by participants. Conclusion: Variant reclassification was experienced as a signal of personalized, ongoing care. Timely, contextualized, patient-centered re-contact practices may reduce uncertainty, strengthen trust, and help patients not feel forgotten.

6

A Common Pathogenic Founder Variant in Rwandan Breast Cancer Cases

Manirakiza, A. V.; Baichoo, S.; Uwineza, A.; Dukundane, D.; Rugengamanzi, E.; Mutamuliza, J.; Niragira, A.; Muvunyi, R.; Besada, J.; Nielsen, S.; Bucknor, B.; Koeller, D. R.; Andrews, C.; Mutesa, L.; Fadelu, T.; Rebbeck, T. R.

2026-05-29 genetics 10.64898/2026.05.26.727861 medRxiv

Top 0.1%

6.7%

Germline data from African populations remain sparse, limiting characterization of population-specific BRCA1/2 pathogenic variants. In a study of 175 Rwandan women with breast cancer, 7 unrelated carriers (4% of cases; 22% of pathogenic variant carriers) harbored the same BRCA1 frameshift variant, c.4065_4068del (p.Asn1355Lysfs*10), which is extremely rare in gnomAD yet recurrent in European, Asian, and Middle Eastern cohorts. Whole-exome sequencing and haplotype analysis of all 7 carriers revealed a shared ancestral block of approximately 581 kb surrounding the variant, and extended haplotype homozygosity and network analyses confirmed a common founder origin. Coalescent-based age estimation placed the founder event approximately 4,000-10,000 years ago. Comparison with 1000 Genomes Project data showed the founder haplotype is absent or exceedingly rare outside African and South Asian populations. These findings strongly suggest that c.4065_4068del is a prehistoric BRCA1 founder variant in Rwanda, with implications for targeted genetic testing, cascade screening, and cancer prevention in the region.

7

Psychometric Validation of the Education and Assessment of Genetic Literacy (EAGL) Measure

Barna, L. S.; Liao, Y.; Wierbicki, M.; Ramirez-Renta, G. M.; Kaphingst, K.; Gunter, C.

2026-05-26 genetics 10.64898/2026.05.22.727229 medRxiv

Top 0.1%

6.5%

Genetic literacy is an integral measure for examining society's interaction with genetics, but widely used "genetic literacy" measures lack both knowledge comprehension measures and psychometric validation. To address these issues, we validated the Education and Assessment of Genetic Literacy measure (EAGL) in a sample of 2,708 US participants, using both exploratory and confirmatory factor analysis. In addition to standard subjective and objective knowledge subscales, our measure's distinct knowledge comprehension subscale focuses on autism as an example of a complex condition. Regression analyses showed a statistically significant interaction between education and personal connection to autism in relation to knowledge comprehension (F=3.68, p=0.003). Separately, those in our sample with a connection to autism scored higher on the subjective knowledge section only (F=19.52, p<0.001), concurring with previous demonstrations of a subjective-objective knowledge gap in science literacy. We explored geographic location as one potential factor in genetic literacy and found that metropolitan vs non-metropolitan status had no significant main effects on overall levels. After the validation process, we have two multi-domain measures that accurately capture the construct of genetic literacy and are available for wide use: the multi-faceted EAGL-long, which has previously been tested in thousands of participants, and the validated three-factor EAGL-short.

8

Biallelic CYB5A disruptions in 46,XY Disorder of Sex Development: Identification and Characterization of a Novel Deep Intronic Variant

Moradifard, S.; Le, T. N. U.; Ha, N. T.; Dung, V. C.; Thao, B. P.; Harley, V. R.

2026-05-12 genetic and genomic medicine 10.64898/2026.05.05.26352416 medRxiv

Top 0.1%

6.4%

Background: The diagnostic yield for 46,XY disorders of sex development (DSD) remains limited. Whole-genome sequencing (WGS) improves detection of both coding and non-coding variants that may be missed by routine testing. Cytochrome b5, encoded by CYB5A, is an essential co-factor for CYP17A1-mediated 17,20-lyase activity. We report WGS of a Vietnamese family with 46,XY DSD in which two siblings presented with female external genitalia. Methods: Clinical assessment and hormone profiling were conducted. WGS was conducted on peripheral blood DNA from the two affected siblings, followed by variant annotation and ACMG-based classification. A minigene RNA splicing assay in HEK293 cells was used to evaluate the functional impact of the CYB5A intronic variant. Results: The patients' hormone profiles showed low testosterone and estradiol. WGS identified compound-heterozygous CYB5A variants: a paternally inherited missense variant (p.Val34Glu, likely pathogenic) and a maternally inherited deep intronic deletion (c.129+862_129+863del) for which SpliceAI predicted aberrant splicing. Minigene assays confirmed that the intronic deletion creates cryptic splice sites, resulting in pseudoexon inclusion and a premature stop codon, consistent with nonsense-mediated decay. The intronic variant meets ACMG criteria for pathogenicity. Conclusion: This family expands the spectrum of CYB5A-related DSD and demonstrates that compound-heterozygous variants, including deep intronic defects, can disrupt 17,20-lyase activity. These findings highlight the importance of WGS and functional assays for identifying clinically relevant non-coding variants in DSD.

9

Fertility rates across generations in twins and singletons: A total population study in Finland

Nieme de Paiva, S.; Hukkanen, M.; Latvala, A.; Kaprio, J.; Zellers, S.

2026-05-22 sexual and reproductive health 10.64898/2026.05.20.26353670 medRxiv

Top 0.1%

6.3%

Study question: Do twin status and zygosity (monozygotic vs. dizygotic; same-sex vs. opposite-sex) predict fertility outcomes and intergenerational reproductive patterns compared with singletons? Summary answer: Among females, dizygotic twins had modestly higher completed fertility than singletons and monozygotic twins and were more likely to have a twin birth. Fertility did not differ meaningfully among males. These differences were restricted to the twin generation and did not persist in the next generation, indicating sex-specific and generation-specific effects rather than intergenerational transmission. What is known already: Dizygotic twinning is associated with heritable hyperovulation and higher natural fertility, but less is known about whether being a twin or zygosity influences reproductive outcomes across generations. Study design, size, duration: A population-based longitudinal cohort study using part of the Finnish Twin Cohort and national population registers. Participants included monozygotic (MZ; N = 4,068), same-sex dizygotic (SSDZ; N = 8,890), opposite-sex dizygotic (OSDZ; N = 8,474) twins, and singleton controls (N = 1,193,404) born between 1945 and 1957 (total N = 1,254,103; 49.1% female), their mothers, their children, and their grandchildren. Participants/materials, setting, methods: Fertility outcomes (number of biological children, age at first birth, childlessness, multiple births) were derived from Finnish population registers. Analyses followed a preregistered plan (https://osf.io/qbwv3). Main results and the role of chance: Differences in fertility between singletons and twins were modest and varied by sex and zygosity. Differences were observed generally in the mothers of twins and female twins themselves, with limited differences in the offspring of twins as compared to the offspring of singletons. Twins were slightly older at first birth, had fewer total biological offspring, but were more likely to have a twin birth.
Dizygotic twins in particular differed from monozygotic twins and singletons. Limitations, reasons for caution: Findings are limited to individuals born in mid-20th-century Finland, and thus generalizability to recent populations or non-Nordic contexts may be restricted. Further, analyses are observational, and causal inference is limited because fertility rates may also reflect other influences, such as social or cultural factors. Wider implications of the findings: These findings suggest that zygosity and sex interact to shape reproductive outcomes, offering insight into genetic and environmental contributions to fertility. They highlight the value of large twin cohorts for studying intergenerational reproductive trends and the representativeness of twins in population-based fertility research.

10

Comprehensive analysis of de novo variants across 2,497 orofacial cleft trios reveals novel genetic drivers of disease

Kurtas, N. E.; Sanchis-Juan, A.; Shin, E.; Curtis, S. W.; Robinson, K. R.; Lee, A. S.; Alade, A. A.; Zhao, X.; Fu, J.; Diaz Perez, K. K.; Gowans, J. J. L.; Eshete, M. A.; Adeyemo, W. L.; Buxo, C. J.; Padilla, C. D.; Poletta, F. A.; Carreno Torres, A.; Wehby, G. L.; Hecht, J. T.; Moreno Uribe, L. M.; Mukhopadhyay, N.; Shaffer, J. R.; Weinberg, S. M.; Murray, J. C.; Beaty, T. H.; Butali, A.; Talkowski, M.; Marazita, M. L.; Leslie-Clarkson, E. J.; Brand, H.

2026-05-24 genetic and genomic medicine 10.64898/2026.05.21.26352934 medRxiv

Top 0.2%

4.4%

Background: Orofacial clefts (OFCs) and other palate abnormalities (PAs) are among the most common birth defects worldwide and are characterized by the abnormal formation of the lip and/or palate. Genetic studies have traditionally classified OFC cases as either syndromic, involving OFCs alongside other congenital anomalies, or nonsyndromic, which represent the majority of cases and occur in isolation. Emerging genomic evidence indicates that genes traditionally associated with syndromic forms of OFC can also harbor variants contributing to isolated cases, challenging the notion of a strict dichotomy between these categories and supporting their integration for gene discovery. Methods: In this study, we applied multiple analytic approaches to characterize the genetic architecture of OFC and PAs by integrating genomic data from 2,497 trios with an OFC (n=2,080) or PA (n=417) affected proband. We compared these findings across OFC subtypes and syndromic status with those from 5,515 control trios to identify enriched biological pathways and mechanisms and to prioritize candidate genes using variant burden testing. Results: We observed a significant enrichment of de novo protein-truncating and damaging missense variants in cases compared to controls (OR = 2.17, p = 1.21×10⁻³²), with particularly strong signals in biologically relevant gene sets involving OFC-associated, constrained, Mendelian disorder, and mouse candidate genes. Variant burden testing identified 39 OFC risk genes at FDR ≤ 0.05, which we then integrated with 593 established OFC genes to interrogate the functional underpinnings of OFC via network analysis. This analysis revealed 309 high-order interactor genes not previously associated with OFC. Notably, this OFC network clustered into ten distinct biological pathways, with nucleosome-associated genes showing significant enrichment among cases in our cohort (OR = 14.8, p = 8.1×10⁻⁴).
In a final integrative step, we combined evidence across all analyses to nominate 231 candidate genes, 32 of which contained at least two deleterious de novo variants in our cohort. Conclusions: These findings underscore the value of integrating diverse OFC and PA subtypes, syndromic status, and variant classes to refine the genetic architecture of these disorders, highlighting both phenotypic expansion of known disease genes and the emergence of novel gene-phenotype associations.

11

Improved prostate cancer prediction by combining Prostate-Specific Antigen (PSA) test results with Genetic Risk Scores (GRS/PRS)

Lu, J.; Chen, G.; Merriel, S. W. D.; Weedon, M. N.; Murray, A.; Bailey, S. E. R.; Green, H. D.

2026-05-18 genetic and genomic medicine 10.64898/2026.05.14.26353195 medRxiv

Top 0.2%

4.1%

Background: Prostate cancer is the second most common cancer in men worldwide. The Prostate-Specific Antigen (PSA) blood test is widely used for prostate cancer detection but suffers from high false-positive rates (up to 80%). Genetic risk scores (GRS/PRS) have a similar performance to PSA testing in predicting prostate cancer risk. Method: GRS269 for prostate cancer was derived using 269 known risk variants and applied to UK Biobank participants. We assessed whether GRS269 improved power to predict prostate cancer diagnosis on top of age and pre-prostatectomy PSA level among 17,380 cases. Longitudinal PSA measurements were processed as median, first, last (most recent), and random PSA. All models were adjusted for age. Results: Across all PSA measures, the integrated model combining GRS269, PSA, and age consistently outperformed models using GRS269 or PSA alone. The highest predictive performance was observed using the last PSA value combined with GRS269 (AUC = 0.82, 95% CI: 0.81-0.82), compared to GRS269 alone (AUC = 0.70, 95% CI: 0.68-0.72) or PSA alone (AUC = 0.73, 95% CI: 0.70-0.75). Conclusion: Combining genetic risk with PSA and age improves prostate cancer risk prediction in a population setting. These findings highlight the potential for integrating GRS to enhance early prostate cancer prediction pathways in primary care.

12

Differential causative effects of germline pathogenic variants in MUTYH and PALB2 in a patient with colorectal polyposis and breast cancer

Camacho Valenzuela, J.; Pelletier, D.; Polak, P.; Fu, L.; Hamel, N.; Domecq, C.; Ahmed, A.; Robles-Espinoza, C. D.; Foulkes, W. D.

2026-05-25 genetic and genomic medicine 10.64898/2026.05.15.26352890 medRxiv

Top 0.2%

3.9%

Purpose: Patients carrying Germline Pathogenic Variants (GPVs) in multiple cancer susceptibility genes (CSGs) can be described within the context of Multi-locus Inherited Neoplasia Allele Syndrome (MINAS). The role of each GPV is typically interpreted based on clinical phenotypes. Here, we used tumor sequencing, particularly mutational signatures, to investigate the contribution of GPVs in MUTYH and PALB2 to colorectal polyposis and breast cancer in a single patient at a molecular level. Methods: We analyzed tumor sequencing data, including mutational signatures and genomic scars, of a breast tumor and a colorectal polyp from a patient with biallelic GPVs in MUTYH and a heterozygous GPV in PALB2. Results: The colorectal polyp showed a dominant contribution of MUTYH-associated Base Excision Repair deficiency (BERd) mutational signatures, with no evidence of Homologous Recombination Repair Deficiency (HRD). In contrast, the breast tumor showed both MUTYH-driven BERd and HRD-associated signatures, including SBS3, ID6 and an elevated HRD score, despite the absence of a detectable second hit in PALB2. These findings suggest a differential contribution from the CSGs, with MUTYH contributing to both lesions and PALB2 contributing specifically to the breast tumor. The observed pattern does not align with the additive or synergistic models described in MINAS. Conclusions: Our study provides evidence that mutational signatures can elucidate the contribution of multiple CSGs to tumorigenesis within a single patient. These findings extend current interpretations of MINAS beyond additive or synergistic phenotypes, which may help to better understand tumor etiology, with potential clinical implications, including eligibility for targeted therapies.

13

Integrating enriched case data from national laboratory testing with population-based case-control analyses: a novel statistical likelihood-ratio methodology for PS4 applied to 325,345 breast cancer cases and 671,006 controls

Allen, S.; Rowlands, C. F.; Garrett, A.; Couch, F.; Richardson, M. E.; Pesaran, T.; Pethick, J.; Lavelle, K.; McRonald, F.; Vernon, S.; Torr, B.; Loong, L.; Aungraheeta, R.; Durkie, M.; Burghel, G. J.; Callaway, A.; Robinson, R.; Field, J.; Frugtniet, B.; Palmer-Smith, S.; Grant, J.; Pagan, J.; McDevitt, T.; Snape, K.; Hanson, H.; McVeigh, T.; Loveday, C.; Jones, M.; Hardy, S.; Turnbull, C.; CanVIG-UK

2026-05-17 genetic and genomic medicine 10.64898/2026.05.13.26353095 medRxiv

Top 0.3%

3.7%

Background: For many evidence criteria within v3.0 of the ACMG/AMP guidelines, methodologies have been developed to empower their use outside the stipulated evidence strengths. However, no such methodology has been established for case-control data (PS4). With the release of large-scale unselected case-control datasets and expansion of nationally-collected laboratory datasets enriched for pathogenic variant carriers, there is potential to combine datasets across ascertainment contexts in a more quantitative manner using novel likelihood ratio tools. Methods: Using our published PS4-LR-Calculator, we calculated a combined log likelihood ratio (PS4-LLR) across five datasets (three unselected, and two enriched), and estimated enrichment of pathogenic variants in clinically-ascertained laboratory data using truncating variant prevalence. Results: Data were combined for 10,817 missense variants from 325,345 female breast cancer patients and 671,006 controls of Western European ancestry for five breast cancer susceptibility genes (BRCA1, BRCA2, PALB2, ATM, CHEK2). A combined LLR was produced for 4,690 missense variants; 927 variants received evidence towards pathogenicity (LLR ≥ 1), and 3,242 received evidence towards benignity (LLR ≤ -1). Conclusion: This flexible, variant-level methodology combines nationally-collected 'enriched' datasets with unselected case-control cohorts, expanding the available information for case-control analysis, boosting power, enabling exploration of atypical penetrance and empowering variant classification.

14

Effects of Starting and Stopping Combined Oral Contraceptives on Markers of Ovarian Reserve

Bernig, U.; Kördel, M.; Sundström-Poromaa, I.; Kroemer, N. B.; Henes, M.

2026-06-01 sexual and reproductive health 10.64898/2026.05.29.26354411 medRxiv

Top 0.3%

3.7%

Objective: To examine the effects of combined oral contraceptive (OC) use on clinical markers of ovarian reserve by comparing anti-Müllerian hormone (AMH), antral follicle count (AFC), and ovarian volume (OV) before and after starting or stopping OC. Methods: This analysis is based on data from a prospective cohort study conducted at the University Hospital Tübingen, Germany, as part of the IRTG-2804 project. A total of 54 healthy women were included and categorized into three groups based on their OC use status: OC starters (n = 12), stoppers (n = 16), and long-term OC users (n = 26). Each participant underwent a transvaginal ultrasound (including AFC and OV) and serum sampling (including AMH) at two time points (S1 and S2), three to six months apart. OC starters were assessed first during the early follicular phase (day 1-7) and then during active OC intake (day 8-21), while stoppers were assessed in the reverse order. Long-term users were assessed twice during active OC intake. Results: OC stoppers showed significant within-group increases in all ovarian reserve markers, including AMH (Δ = 2.57 ng/mL, p < .001), AFC (Δ = 3.88, p = .004), and OV, which almost doubled (1.94-fold increase; 95% CI [1.35, 2.80], p < .001). In contrast, OC starters exhibited a significant decline in AMH (Δ = -1.25 ng/mL, p = .013), but no changes in AFC or OV. No significant longitudinal changes were observed among long-term OC users. Conclusion: AMH levels decrease after starting OC use whereas AFC and OV are not affected. In contrast, AMH, AFC, and OV recover within three to six months after stopping OC, suggesting a reversible suppression of ovarian reserve markers during OC use. These findings are clinically relevant for fertility counseling and for the interpretation of ovarian reserve markers in women using hormonal contraception.

15

Large-scale association study identifies lung cancer susceptibility copy number variants and their potential functional role in genetic instability

Xiao, F.; Qin, F.; Luo, X.; Slewitzke, S. E.; Fernandes, G. F.; Johansson, M.; Xiao, X.; Zaridze, D.; Bojesen, S. E.; Shete, S.; Albanes, D.; Aldrich, M. C.; Tardon, A.; Fernandez-Tardon, G.; Le Marchand, L.; Rennert, G.; Bickeböller, H.; Wichmann, H.-E.; Risch, A.; Muley, T.; Rosenberger, A.; Field, J. K.; Davies, M.; Woll, P.; Kiemeney, L. A.; Haugen, A.; Zienolddiny, S.; Lam, S.; Johansson, M.; Grankvist, K.; Schabath, M. B.; Andrew, A.; Lazarus, P.; Arnold, S. M.; Zhu, D.; Brenner, H.; Neuhouser, M. L.; Hung, R. J.; Christiani, D. C.; McKay, J.; Cai, G.; Xia, J.; Amos, C. I.

2026-05-15 genetic and genomic medicine 10.64898/2026.05.11.26352741 medRxiv

Top 0.3%

3.6%

Background: Genome-wide association studies (GWAS) have identified numerous lung cancer susceptibility loci based on single nucleotide polymorphisms (SNPs), yet a substantial proportion of heritability remains unexplained. We therefore evaluated germline copy number variants (CNVs) as an underexplored source of genetic susceptibility and potential contributors to genomic instability in lung cancer. Methods: We conducted a genome-wide analysis of germline CNVs using 19,342 cases and 15,917 controls from the Transdisciplinary Research in Cancer of the Lung (TRICL) consortium, with replication in two independent cohorts. High-confidence CNVs were identified by integrating two CNV callers including PennCNV and modSaRa2. Association analyses were performed using both gene-based and CNV region-based approaches. Polygenic risk scores (PRS) were constructed from top loci, and functional validation was conducted using siRNA-mediated knockdown in lung fibroblast cells. Results: We identified CNVs in four genomic regions (1p36.22, 2q31.2, 6p21.32, and 19q13.32) significantly associated with lung cancer risk. Two loci (1p36.22 and 2q31.2) were consistently supported across both analytical strategies. A CNV-based PRS constructed from key genes (CLCN6, NFE2L2, OPA3, and PSMB8) was significantly associated with lung cancer risk and replicated across independent datasets. Functional assays demonstrated that knockdown of NFE2L2 and OPA3 increased endogenous DNA damage, supporting a role in genomic stability. Conclusions: Germline CNVs contribute to lung cancer susceptibility and may influence carcinogenesis through mechanisms related to genomic instability. Impact: These findings expand the genetic architecture of lung cancer and highlight CNVs as potential biomarkers for improving risk stratification and informing precision prevention strategies.

16

Association of a polygenic risk score with coronary atherosclerotic burden in clinical CT angiograms

Hartmann, K.; Gannon, M.; Natarajan, P.; Greenland, P.; Penn Medicine BioBank; Levin, M.

2026-05-27 genetic and genomic medicine 10.64898/2026.05.26.26353801 medRxiv

Top 0.3%

3.6%

Background: Polygenic risk scores (PRS) for coronary artery disease (CAD) are associated with cardiovascular events, but the relationship between inherited risk and routinely reported coronary computed tomography angiography (CTA) findings has not been studied. Objectives: To evaluate associations between a genome-wide PRS for angiographic coronary disease burden and coronary CTA-derived measures of atherosclerotic severity in a real-world clinical cohort. Methods: We studied Penn Medicine BioBank participants with available genotypes and clinically obtained coronary CTA reports. A previously published PRS for angiographic CAD burden was calculated using pgsc_calc. CAD-RADS scores and coronary artery calcium (CAC) values were extracted from radiology reports using the large language model Llama 3.1 8B. Associations between PRS and CAD-RADS severity were evaluated using Bayesian cumulative ordinal logit regression, while associations with log-transformed CAC burden were assessed using Bayesian linear regression. Results: Among 630 participants, median age was 59 years (IQR 49-68), 53% were female, 62% were genetically similar to a European reference population, and 34% to an African reference population. LLM-extracted CAD-RADS and CAC values demonstrated near-perfect agreement with manual abstraction. Higher PRS was associated with greater coronary atherosclerotic burden on CTA. Each 1-standard deviation (SD) increase in PRS was associated with a 20% higher odds of belonging to a more severe CAD-RADS category (cumulative OR 1.20, 95% credible interval 1.06-1.44). Higher PRS was also associated with greater CAC burden (β 0.38, 95% credible interval 0.15-0.61). Conclusions: Polygenic risk for angiographic coronary disease burden is reflected in clinically reported coronary CTA severity measures, including CAD-RADS and CAC. These findings demonstrate that inherited susceptibility to CAD manifests as greater anatomic atherosclerotic burden at the time of clinical presentation and support further investigation of genetic risk integration into imaging-based cardiovascular risk assessment.
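
The per-SD odds ratio above can be rescaled to other PRS differences. The sketch below assumes the effect is linear on the log-odds scale, as in a cumulative ordinal logit model with the PRS entered as a linear term; the function name is hypothetical, and values for differences other than 1 SD are extrapolations, not results reported in the abstract.

```python
import math


def odds_ratio_for_sd_change(or_per_sd, sd_change):
    """Rescale a per-SD odds ratio to a PRS difference of sd_change SDs.

    Assumes the log-odds are linear in the PRS, so the log odds ratio
    scales proportionally with the size of the PRS difference.
    """
    return math.exp(math.log(or_per_sd) * sd_change)


# Reported point estimate: cumulative OR 1.20 per 1-SD increase in PRS.
for sd_change in (0.5, 1.0, 2.0):
    implied = odds_ratio_for_sd_change(1.20, sd_change)
    print(f"{sd_change:+.1f} SD -> implied cumulative OR {implied:.3f}")
```

The same rescaling would apply to the credible interval bounds, since they are also defined on the log-odds scale.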

17

Healthcare professionals' perspectives on a multilevel cardiovascular risk management intervention (PROSPERA programme)

Bongaerts, V. A. M. C.; van Gestel, L. C.; van Peet, P. G.; Vuijk, M.-L. S.; Hageman, S. H. J.; Dorresteijn, J. A. N.; Bonten, T. N.; Numans, M. E.; van Os, H. J. A.; Vos, R. C.

2026-06-09 cardiovascular medicine 10.64898/2026.06.08.26355169 medRxiv

Top 0.3%

3.5%

Background: Two-thirds of Dutch cardiovascular risk management (CVRM) for patients at risk of cardiovascular disease is delivered in primary care practices. While individual risk scores are increasingly used during consultation, a population-level structure for risk-based patient outreach is not currently available. We therefore developed the PROSPERA programme, a multilevel intervention comprising population-level risk stratification and individual-level support tools. Aim: To assess anticipated and experienced barriers and facilitators among healthcare professionals (HCPs) to inform implementation in primary care. Methods: We conducted four focus groups and six interviews with nine primary care HCPs to explore anticipated and experienced barriers and facilitators. Inductive codes were thematically analysed and assigned to corresponding domains of the Theoretical Domains Framework (TDF) and the related Capability, Opportunity, Motivation model of Behaviour. Results: Barriers and facilitators were identified in 11 TDF domains. Population-level barriers included altered professional roles and limitations in technological infrastructure. Individual-level barriers were limited skills in interpreting risk calculations and difficulty integrating tools into clinical routine. Facilitators were related to beliefs on the importance of providing proactive care (population level), the use of U-Prevent for risk communication (individual level) and positive patient responses to the Lifestylecheck questionnaire (individual level). Conclusion: Addressing barriers and facilitators identified at both the population and individual levels can support implementation of the PROSPERA programme. Opportunities exist in education and training of HCPs in risk communication, as well as support in restructuring the physical and digital environment.

18

Targeted BRCA1/BRCA2 Sequencing in a Bangladeshi Clinically Referred Cohort Identifies Candidate BRCA1 Loss-of-Function Variants and a Multi-Exon Deletion-Like CNV Signal

Al Sium, S. M.; Banu, T. A.; Goswami, B.; Naser, S. R.; Habib, M. A.; Akter, S.; Ara, M. H.; Al Din, S. M. S.; Nafisa, A.; Nayem, M. R.; Rabbi, M. F. A.; Sarkar, M. M. H.; Khan, M. S.

2026-05-20 oncology 10.64898/2026.05.11.26352643 medRxiv

Top 0.3%

3.5%

Background: Population-relevant BRCA1/BRCA2 data from Bangladesh are scarce, creating challenges for hereditary breast and ovarian cancer variant interpretation, counseling, and follow-up testing. We examined a clinically referred Bangladeshi cohort to characterize assay-derived BRCA1/BRCA2 short variants, sequencing-depth performance, and copy-number findings in a conservative pilot framework. Methods: Twenty-three de-identified blood-derived DNA samples were assessed using a targeted BRCA1/BRCA2 next-generation sequencing workflow. Downstream analysis used assay-generated short-variant, coverage, and CNV outputs, with coordinates reported on hg19/GRCh37. Short variants were evaluated from high-confidence PASS/VCC-H calls, and CNV review incorporated both target-region and amplicon-level copy-number patterns. Results: After removal of four low-VAF review observations, the primary germline-compatible dataset comprised 304 short-variant observations representing 34 unique variants. Both BRCA1 and BRCA2 contributed comparable variant burdens, while the overall profile was mainly composed of missense and synonymous changes. Six sample-specific heterozygous BRCA1 truncating candidates were observed, including five frameshift variants and one stop-gain variant. Protein-level mapping placed these events across the central-to-C-terminal portion of BRCA1. Sequencing depth was consistently high across the targeted regions, with all 4,255 amplicon-sample measurements exceeding 280x and 99.91% reaching at least 500x. Copy-number analysis highlighted one candidate BRCA1 multi-exon deletion-like event involving exons 15-20 in BCSIR-BRCA-21, with unresolved partial exon 14 involvement. Conclusions: This study provides an initial Bangladesh-focused targeted BRCA1/BRCA2 dataset and identifies candidate short-variant and CNV findings for validation. These findings should be interpreted as analytical candidates only and require confirmatory testing and expert clinical curation before any clinical application. The cohort is referral-enriched and should not be used to infer population prevalence.
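
The depth figures above (a minimum and a fraction meeting a threshold) can be summarized as in the following sketch. The function name and the depth values are synthetic assumptions: the abstract gives only the totals, and four measurements below 500x is simply one count that reproduces the reported 99.91% of 4,255, not a figure stated by the authors.

```python
def depth_summary(depths, threshold=500):
    """Return the minimum depth and the fraction of measurements >= threshold."""
    if not depths:
        raise ValueError("depths must be non-empty")
    meeting = sum(1 for depth in depths if depth >= threshold)
    return min(depths), meeting / len(depths)


# Synthetic depths: 4,255 amplicon-sample measurements, 4 of them below 500x.
synthetic_depths = [300] * 4 + [900] * 4251
minimum, fraction = depth_summary(synthetic_depths)
print(f"minimum depth {minimum}x, {fraction:.2%} at or above 500x")
```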

19

Cultural affiliation accounts for most of the spatiotemporal variation in burial rite practices

Canteri, E.; Staniuk, R.; Timpson, A.; Schauer, P.; Bulatovic, J.; Ivanova-Bieg, M.; Reiter, S. S.; Rose, H. A.; Kolar, J.; Thomas, M. G.; Racimo, F.; Shennan, S.

2026-05-28 genetics 10.64898/2026.05.25.725982 medRxiv

Top 0.3%

3.5%

Describing and interpreting spatiotemporal patterns in human culture has been a central focus of anthropology and archaeology for over a century. Recent ethnographic studies have highlighted the complexity of the processes generating these patterns, including isolation-by-distance, homophily, and common descent. However, investigating these processes in prehistoric archaeology remains challenging. Here we make use of a new interdisciplinary database and a combined dataset of ancient DNA (aDNA) genomic sequences to analyse the relationship between spatiotemporal patterns in cultural and genomic variation, by testing whether broadly defined clusters of genomic affinities correspond to spatiotemporal changes in burial rites, while controlling for other factors, using a Gaussian process model. We use data from the Big Interdisciplinary Archaeological Database (BIAD), linking mortuary information from ~4,200 individuals with genetic ancestry and mobility data inferred from over 1,300 human genomes, from Western Eurasia ~10,000-2,000 BP. By integrating and modelling these diverse datasets, we aim to provide a detailed understanding of how genomic history intersects with cultural evolution, offering new insights into the dynamics behind these complex processes, and the extent to which genes and culture are transmitted in parallel. In the case of burial orientation, we found that cultural affiliation was the main factor accounting for variation with little to no role for ancestry, while for body position the picture was more mixed but cultural affiliation also played an important role.

20

Conditional and marginal SNP-heritability to leverage ancestral and environmental diversity

Singh Sachan, A. N.; Schwartzman, A.; Azriel, D.

2026-05-29 genetics 10.64898/2026.05.28.728536 medRxiv

Top 0.4%

3.0%

SNP-heritability is defined as the fraction of variance of a trait that is explained by the SNPs in a genome-wide association study. Several methodologies have been proposed to estimate this quantity. More recent methods aim to do so with ancestrally diverse datasets and yet obtain a single heritability for an entire dataset, which we refer to as marginal heritability. However, the different underlying subpopulations that compose a genetically diverse dataset might have different environmental and genetic exposures, and thus may have different heritabilities. To address this, we propose a conditional SNP-heritability approach that allows estimation of multiple SNP-heritabilities within a dataset, corresponding to different ancestral compositions and environmental exposures. We take a careful statistical approach, including estimation of conditional genetic and environmental variances, and calculation of standard errors via a combination of the delta method with bootstrapping. We validate our method via extensive simulations. We then apply it to an ancestrally and socio-economically diverse dataset of 6,603 subjects aged around 9 to 11 from the Adolescent Brain Cognitive Development study, and illustrate how the SNP-heritability of intelligence scores can change due to differing extrinsic variances in different socio-economic groups, which is consistent with previous work in the literature. This conditional estimation approach can be a valuable tool for understanding differences in risks across subpopulations. Our work here improves on existing methodology and allows us to leverage the heterogeneity of the data to obtain new insights.
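
To make the definition concrete, the sketch below computes heritability as the ratio of genetic variance to total variance and shows how it falls when environmental variance rises with genetic variance held fixed. It also gives a first-order delta-method standard error for that ratio. This is a generic textbook illustration, not the authors' estimator: the function names, the variance values, and the standard errors of the variance components are all assumptions, and the bootstrap step mentioned in the abstract is not reproduced.

```python
import math


def snp_heritability(var_g, var_e):
    """Fraction of trait variance explained by SNPs: var_g / (var_g + var_e)."""
    return var_g / (var_g + var_e)


def delta_method_se(var_g, var_e, se_g, se_e, cov_ge=0.0):
    """First-order delta-method standard error of var_g / (var_g + var_e).

    se_g and se_e are standard errors of the variance-component estimates;
    cov_ge is the covariance between the two estimates (assumed 0 by default).
    """
    total = var_g + var_e
    d_g = var_e / total**2   # partial derivative with respect to var_g
    d_e = -var_g / total**2  # partial derivative with respect to var_e
    variance = d_g**2 * se_g**2 + d_e**2 * se_e**2 + 2 * d_g * d_e * cov_ge
    return math.sqrt(variance)


# Same genetic variance, two hypothetical groups with different environmental variance.
for label, var_e in (("lower environmental variance", 0.6),
                     ("higher environmental variance", 1.2)):
    h2 = snp_heritability(0.4, var_e)
    se = delta_method_se(0.4, var_e, se_g=0.05, se_e=0.05)
    print(f"{label}: h2 = {h2:.3f} (delta-method SE {se:.3f})")
```

Because heritability is a ratio, two groups with identical genetic variance can still differ in heritability, which is the motivation for estimating it conditionally rather than once for the whole dataset.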